Buffer Overflow Protection

Buffer overflow protection is any of various techniques used during software development to enhance the security of executable programs by detecting buffer overflows on stack-allocated variables, and preventing them from causing program misbehavior or from becoming serious security vulnerabilities. A stack buffer overflow occurs when a program writes to a memory address on the program's call stack outside of the intended data structure, which is usually a fixed-length buffer. Stack buffer overflow bugs are caused when a program writes more data to a buffer located on the stack than is actually allocated for that buffer. This almost always results in corruption of adjacent data on the stack, which could lead to program crashes, incorrect operation, or security issues.

Typically, buffer overflow protection modifies the organization of stack-allocated data so it includes a ''canary'' value that, when destroyed by a stack buffer overflow, shows that a buffer preceding it in memory has been overflowed. By verifying the canary value, execution of the affected program can be terminated, preventing it from misbehaving or from allowing an attacker to take control over it. Other buffer overflow protection techniques include ''bounds checking'', which checks accesses to each allocated block of memory so they cannot go beyond the actually allocated space, and ''tagging'', which ensures that memory allocated for storing data cannot contain executable code.

Overfilling a buffer allocated on the stack is more likely to influence program execution than overfilling a buffer on the heap because the stack contains the return addresses for all active function calls. However, similar implementation-specific protections also exist against heap-based overflows. There are several implementations of buffer overflow protection, including those for the GNU Compiler Collection, LLVM, Microsoft Visual Studio, and other compilers.


Overview

A stack buffer overflow occurs when a program writes to a memory address on the program's call stack outside of the intended data structure, which is usually a fixed-length buffer. Stack buffer overflow bugs are caused when a program writes more data to a buffer located on the stack than is actually allocated for that buffer. This almost always results in corruption of adjacent data on the stack, and in cases where the overflow was triggered by mistake, will often cause the program to crash or operate incorrectly. Stack buffer overflow is a type of the more general programming malfunction known as buffer overflow (or buffer overrun). Overfilling a buffer on the stack is more likely to derail program execution than overfilling a buffer on the heap because the stack contains the return addresses for all active function calls.

Stack buffer overflow can be caused deliberately as part of an attack known as stack smashing. If the affected program is running with special privileges, or if it accepts data from untrusted network hosts (for example, a public webserver), then the bug is a potential security vulnerability that allows an attacker to inject executable code into the running program and take control of the process. This is one of the oldest and most reliable methods for attackers to gain unauthorized access to a computer.

Typically, buffer overflow protection modifies the organization of data in the stack frame of a function call to include a "canary" value that, when destroyed, shows that a buffer preceding it in memory has been overflowed. This provides the benefit of preventing an entire class of attacks. According to some researchers, the performance impact of these techniques is negligible.

Stack-smashing protection is unable to protect against certain forms of attack. For example, it cannot protect against buffer overflows in the heap. The layout of data within a structure cannot safely be altered, because structures are expected to be the same between modules, especially with shared libraries. Any data in a structure after a buffer is therefore impossible to protect with canaries, so programmers must be very careful about how they organize their variables and use their structures.


Canaries

''Canaries'' or ''canary words'' are known values that are placed between a buffer and control data on the stack to monitor buffer overflows. When the buffer overflows, the first data to be corrupted will usually be the canary, and a failed verification of the canary data will therefore signal an overflow, which can then be handled, for example, by invalidating the corrupted data. A canary value should not be confused with a sentinel value.

The terminology is a reference to the historic practice of using canaries in coal mines, since they would be affected by toxic gases earlier than the miners, thus providing a biological warning system. Canaries are alternatively known as ''cookies'', which is meant to evoke the image of a "broken cookie" when the value is corrupted.

There are three types of canaries in use: ''terminator'', ''random'', and ''random XOR''. Current versions of StackGuard support all three, while ProPolice supports ''terminator'' and ''random'' canaries.


Terminator canaries

''Terminator canaries'' use the observation that most buffer overflow attacks are based on certain string operations which end at string terminators. In response, the canaries are built of null terminators, CR, LF, and FF. As a result, the attacker must write a null character before writing the return address to avoid altering the canary. This prevents attacks using strcpy() and other functions that return upon copying a null character. The drawback is that the canary value is known. Even with this protection, an attacker using a copy operation that does not stop at terminators could overwrite the canary with its known value and the control information with values of the attacker's choosing, thus passing the canary check code, which is executed shortly before the processor's return-from-call instruction.
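The effect of a terminator canary on a string-based overflow can be sketched in C. This is a minimal model under stated assumptions, not how any compiler lays out a real frame: `frame_model`, `frame_init`, `canary_intact`, and `unchecked_string_copy` are hypothetical names, and the canary bytes (NUL, CR, LF, and FF taken as the byte 0xFF) and their order are assumed for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative terminator canary: NUL, CR, LF, 0xFF.
 * The exact byte values and their order are an assumption of this sketch. */
static const unsigned char TERMINATOR_CANARY[4] = { 0x00, 0x0D, 0x0A, 0xFF };

/* Hypothetical model of a frame: the canary sits directly after the buffer. */
struct frame_model {
    char buf[8];
    unsigned char canary[4];
    unsigned char control[8];   /* stands in for saved control data */
};

/* Zero the model and place the canary after the buffer. */
void frame_init(struct frame_model *f) {
    memset(f, 0, sizeof *f);
    memcpy(f->canary, TERMINATOR_CANARY, sizeof f->canary);
}

/* Returns 1 if the canary still holds its expected bytes, 0 otherwise. */
int canary_intact(const struct frame_model *f) {
    return memcmp(f->canary, TERMINATOR_CANARY, sizeof f->canary) == 0;
}

/* Models a strcpy-style copy with no bounds check on buf: bytes are copied
 * up to and including the first NUL in src. The loop is capped at the size
 * of the model so that the sketch itself never writes outside it. */
void unchecked_string_copy(struct frame_model *f, const char *src) {
    unsigned char *dst = (unsigned char *)f;
    for (size_t i = 0; i < sizeof *f; i++) {
        dst[i] = (unsigned char)src[i];
        if (src[i] == '\0') {
            break;
        }
    }
}
```

In this model, an overlong string of non-NUL bytes destroys the canary, while a string that tries to reproduce the canary stops being copied at the canary's leading NUL and so never reaches the control data.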


Random canaries

''Random canaries'' are randomly generated, usually from an entropy-gathering daemon, in order to prevent an attacker from knowing their value. Usually, it is not feasible for an attacker to read the canary in order to exploit it; the canary is a secret value known only to the code that needs it, which in this case is the buffer overflow protection code. Normally, a random canary is generated at program initialization and stored in a global variable. This variable is usually padded by unmapped pages, so that attempting to read it by exploiting bugs that read past other data in RAM causes a segmentation fault, terminating the program. It may still be possible to read the canary if the attacker knows where it is, or can get the program to read from the stack.
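A minimal sketch of this scheme in C follows. It assumes a Unix-like system where `/dev/urandom` is readable; the names `g_canary`, `canary_setup`, `guarded_buf`, `guarded_enter`, and `guarded_leave_ok` are hypothetical, and the sketch omits the unmapped guard pages described above.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Process-wide canary, filled once at startup (hypothetical name). */
static uintptr_t g_canary;

/* Reads the canary from /dev/urandom.
 * Returns 0 on success and -1 if the device cannot be read. */
int canary_setup(void) {
    FILE *fp = fopen("/dev/urandom", "rb");
    if (fp == NULL) {
        return -1;
    }
    size_t n = fread(&g_canary, sizeof g_canary, 1, fp);
    fclose(fp);
    return n == 1 ? 0 : -1;
}

/* Hypothetical stand-in for a protected frame. */
struct guarded_buf {
    char data[16];
    uintptr_t canary;   /* copy of g_canary placed after the buffer */
};

/* Function prologue: place the current canary after the buffer. */
void guarded_enter(struct guarded_buf *g) {
    memset(g->data, 0, sizeof g->data);
    g->canary = g_canary;
}

/* Function epilogue: returns 1 if the canary is unchanged, 0 otherwise. */
int guarded_leave_ok(const struct guarded_buf *g) {
    return g->canary == g_canary;
}
```

A real implementation would abort the program when the epilogue check fails rather than return a status code.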


Random XOR canaries

''Random XOR canaries'' are random canaries that are XOR-scrambled using all or part of the control data. In this way, once the canary or the control data is clobbered, the canary value is wrong. Random XOR canaries have the same vulnerabilities as random canaries, except that the "read from stack" method of getting the canary is a bit more complicated. The attacker must get the canary, the algorithm, and the control data in order to regenerate the original canary needed to spoof the protection.

In addition, random XOR canaries can protect against a certain type of attack in which a buffer in a structure is overflowed into a pointer, changing the pointer to point at a piece of control data. Through such a pointer, the control data or return value can be changed without overflowing over the canary. Because of the XOR encoding, however, the canary will be wrong if the control data or return value is changed. Although these canaries protect the control data from being altered by clobbered pointers, they do not protect any other data or the pointers themselves. Function pointers especially are a problem here, as they can be overflowed into and can execute shellcode when called.
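The XOR encoding can be illustrated with a short C sketch. It assumes the canary is scrambled with a single word of control data, modeled here as a return address; `xor_frame`, `xor_frame_enter`, and `xor_frame_ok` are hypothetical names, and real implementations may combine the values differently.

```c
#include <stdint.h>

/* Hypothetical model of the protected part of a frame. */
struct xor_frame {
    uintptr_t stored_canary;  /* random value XOR return address, set on entry */
    uintptr_t return_addr;    /* stands in for the control data */
};

/* Function prologue: store the canary scrambled with the control data. */
void xor_frame_enter(struct xor_frame *f, uintptr_t random_value,
                     uintptr_t return_addr) {
    f->return_addr = return_addr;
    f->stored_canary = random_value ^ return_addr;
}

/* Function epilogue: unscramble with the current control data and compare.
 * Returns 1 if both the canary and the control data are unchanged. */
int xor_frame_ok(const struct xor_frame *f, uintptr_t random_value) {
    return (f->stored_canary ^ f->return_addr) == random_value;
}
```

Because the check recomputes the XOR from the current control data, changing the return address through a stray pointer fails the check even when the stored canary itself is never touched.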


Bounds checking

Bounds checking is a compiler-based technique that adds run-time bounds information for each allocated block of memory, and checks all pointers against those bounds at run-time. For C and C++, bounds checking can be performed at pointer calculation time or at dereference time. Implementations of this approach use either a central repository, which describes each allocated block of memory, or fat pointers, which contain both the pointer and additional data describing the region that they point to.
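The fat-pointer variant can be sketched by hand in C. This is only an illustration of the idea, since a bounds-checking compiler would generate such checks automatically; `fat_ptr`, `fat_make`, `fat_advance`, and `fat_store` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* A pointer bundled with the bounds of the block it points into. */
struct fat_ptr {
    char *base;   /* start of the allocated block */
    size_t len;   /* size of the block in bytes */
    size_t off;   /* current offset; the invariant off <= len always holds */
};

struct fat_ptr fat_make(char *base, size_t len) {
    assert(base != NULL);
    struct fat_ptr p = { base, len, 0 };
    return p;
}

/* Check at pointer-calculation time: refuse to move past one-past-the-end. */
bool fat_advance(struct fat_ptr *p, size_t delta) {
    if (delta > p->len - p->off) {
        return false;
    }
    p->off += delta;
    return true;
}

/* Check at dereference time: refuse to write outside the block. */
bool fat_store(struct fat_ptr p, char value) {
    if (p.off >= p.len) {
        return false;
    }
    p.base[p.off] = value;
    return true;
}
```

Advancing to the one-past-the-end position is allowed, as in ordinary C pointer arithmetic, but storing through that position is rejected.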


Tagging

Tagging is a compiler-based or hardware-based (requiring a tagged architecture) technique for tagging the type of a piece of data in memory, used mainly for type checking. By marking certain areas of memory as non-executable, it effectively prevents memory allocated to store data from containing executable code. Also, certain areas of memory can be marked as non-allocated, preventing buffer overflows. Historically, tagging has been used for implementing high-level programming languages; with appropriate support from the operating system, tagging can also be used to detect buffer overflows. An example is the NX bit hardware feature, supported by Intel, AMD and ARM processors.


Implementations


GNU Compiler Collection (GCC)

Stack-smashing protection was first implemented by ''StackGuard'' in 1997, and published at the 1998 USENIX Security Symposium. StackGuard was introduced as a set of patches to the Intel x86 backend of GCC 2.7. StackGuard was maintained for the Immunix Linux distribution from 1998 to 2003, and was extended with implementations for terminator, random and random XOR canaries. StackGuard was suggested for inclusion in GCC 3.x at the GCC 2003 Summit Proceedings, but this was never achieved.

From 2001 to 2005, IBM developed GCC patches for stack-smashing protection, known as ''ProPolice''. It improved on the idea of StackGuard by placing buffers after local pointers and function arguments in the stack frame. This helped avoid the corruption of pointers, preventing access to arbitrary memory locations.

Red Hat engineers identified problems with ProPolice, however, and in 2005 re-implemented stack-smashing protection for inclusion in GCC 4.1. This work introduced the -fstack-protector flag, which protects only some vulnerable functions, and the -fstack-protector-all flag, which protects all functions whether they need it or not.

In 2012, Google engineers implemented the -fstack-protector-strong flag to strike a better balance between security and performance. This flag protects more kinds of vulnerable functions than -fstack-protector does, but not every function, providing better performance than -fstack-protector-all. It has been available in GCC since version 4.9.

All Fedora packages have been compiled with -fstack-protector since Fedora Core 5, and with -fstack-protector-strong since Fedora 20. Most packages in Ubuntu have been compiled with -fstack-protector since 6.10. Every Arch Linux package has been compiled with -fstack-protector since 2011, and all Arch Linux packages built since 4 May 2014 use -fstack-protector-strong. Stack protection is only used for some packages in Debian, and only for the FreeBSD base system since 8.0. Stack protection is standard in certain operating systems, including OpenBSD, Hardened Gentoo and DragonFly BSD.

StackGuard and ProPolice cannot protect against overflows in automatically allocated structures that overflow into function pointers. ProPolice at least will rearrange the allocation order to get such structures allocated before function pointers. A separate mechanism for pointer protection was proposed in PointGuard and is available on Microsoft Windows.


Microsoft Visual Studio

The compiler suite from Microsoft has implemented buffer overflow protection since version 2003 through the /GS command-line switch, which has been enabled by default since version 2005. Using /GS- disables the protection.


IBM Compiler

Stack-smashing protection can be turned on by the compiler flag -qstackprotect.


Clang/LLVM

Clang supports the same -fstack-protector options as GCC, plus three buffer overflow detectors, namely AddressSanitizer (-fsanitize=address), -fsanitize=bounds, and SafeCode. These systems have different tradeoffs in terms of performance penalty, memory overhead, and classes of detected bugs. Stack protection is standard in certain operating systems, including OpenBSD.


Intel Compiler

Intel's C and C++ compiler supports stack-smashing protection with options similar to those provided by GCC and Microsoft Visual Studio.


Fail-Safe C

''Fail-Safe C'' is an open-source memory-safe ANSI C compiler that performs bounds checking based on fat pointers and object-oriented memory access.


StackGhost (hardware-based)

Invented by Mike Frantzen, StackGhost is a simple tweak to the register window spill/fill routines which makes buffer overflows much more difficult to exploit. It uses a unique hardware feature of the Sun Microsystems SPARC architecture (namely deferred on-stack in-frame register window spill/fill) to detect modifications of return pointers (a common way for an exploit to hijack execution paths) transparently, automatically protecting all applications without requiring binary or source modifications. The performance impact is negligible, less than one percent. The gdb issues that resulted were resolved by Mark Kettenis two years later, which allowed the feature to be enabled. Following this, the StackGhost code was integrated (and optimized) into OpenBSD/SPARC.


A canary example

Normal buffer allocation for x86 architectures and other similar architectures is shown in the buffer overflow entry. Here, we will show the modified process as it pertains to StackGuard.

When a function is called, a stack frame is created. A stack frame is built from the end of memory to the beginning; and each stack frame is placed on the top of the stack, closest to the beginning of memory. Thus, running off the end of a piece of data in a stack frame alters data previously entered into the stack frame; and running off the end of a stack frame places data into the previous stack frame. A typical stack frame may look as below, having a return address (RETA) placed first, followed by other control information (CTLI).

(CTLI)(RETA)

In C, a function may contain many different per-call data structures. Each piece of data created on call is placed in the stack frame in order, and is thus ordered from the end to the beginning of memory. Consider a hypothetical function foo() that declares an integer a, an integer pointer b, a ten-byte character array c, and a three-byte character array d, in that order. Its stack frame is

(d..)(c.........)(b...)(a...)(CTLI)(RETA)

In this hypothetical situation, if more than ten bytes are written to the array c, or more than 13 to the character array d, the excess will overflow into integer pointer b, then into integer a, then into the control information, and finally the return address. By overwriting b, the pointer is made to reference any position in memory, causing a read from an arbitrary address. By overwriting ''RETA'', the function can be made to execute other code (when it attempts to return), either existing functions (ret2libc) or code written into the stack during the overflow.

In a nutshell, poor handling of c and d, such as filling them with unbounded strcpy() calls, may allow an attacker to control a program by influencing the values assigned to c and d directly. The goal of buffer overflow protection is to detect this issue in the least intrusive way possible. This is done by moving whatever can be moved out of harm's way and placing a sort of tripwire, or canary, after the buffer.

Buffer overflow protection is implemented as a change to the compiler. As such, it is possible for the protection to alter the structure of the data on the stack frame. This is exactly the case in systems such as ''ProPolice''. The above function's automatic variables are rearranged more safely: arrays c and d are allocated first in the stack frame, which places integer a and integer pointer b before them in memory. So the stack frame becomes

(b...)(a...)(d..)(c.........)(CTLI)(RETA)

As it is impossible to move ''CTLI'' or ''RETA'' without breaking the produced code, another tactic is employed. An extra piece of information, called a "canary" (CNRY), is placed after the buffers in the stack frame. When the buffers overflow, the canary value is changed. Thus, to effectively attack the program, an attacker must leave a definite indication of the attack. The stack frame is

(b...)(a...)(d..)(c.........)(CNRY)(CTLI)(RETA)

At the end of every function there is an instruction which continues execution from the memory address indicated by ''RETA''. Before this instruction is executed, a check of ''CNRY'' ensures it has not been altered. If the value of ''CNRY'' fails the test, program execution is ended immediately. In essence, both deliberate attacks and inadvertent programming bugs result in a program abort.

The canary technique adds a few instructions of overhead for every function call with an automatic array, immediately before all dynamic buffer allocation and after dynamic buffer deallocation. The overhead generated by this technique is not significant. The technique fails, however, if an overflow leaves the canary unchanged. If the attacker knows that the canary is there, and can determine its value, they may simply overwrite it with its own value. This is usually difficult to arrange intentionally, and highly improbable in unintentional situations.

The position of the canary is implementation specific, but it is always between the buffers and the protected data. Varied positions and lengths have varied benefits.
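The hypothetical function discussed in this section can be sketched in C as follows. This is a reconstruction from the frame diagrams above, not code taken from any particular source: the helpers `get_c` and `get_d` are assumed input sources, and here they return strings short enough to fit, so the sketch runs without overflowing. The actual frame layout depends on the compiler and its options.

```c
#include <string.h>

/* Hypothetical input sources; these short strings fit the buffers below.
 * In a vulnerable program the data would come from an untrusted source. */
static const char *get_c(void) { return "hi"; }
static const char *get_d(void) { return "ok"; }

int foo(void) {
    int a;          /* integer */
    int *b;         /* pointer to integer */
    char c[10];     /* character arrays */
    char d[3];

    b = &a;                 /* b points at a */
    strcpy(c, get_c());     /* unbounded copy: overflows c if the input
                               needs more than 10 bytes including the NUL */
    *b = 5;                 /* write through b; harmful if b was clobbered */
    strcpy(d, get_d());     /* unbounded copy into the 3-byte array */
    return *b;              /* read through b and pass it to the caller */
}
```

With a protecting compiler, a canary check would be inserted before the function returns, so an overflow of c or d that reached the control data would abort the program instead of returning through a corrupted address.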


See also

* Sentinel value (which is not to be confused with a canary value)
* Control-flow integrity
* Address space layout randomization
* Executable space protection
* Memory debugger
* Static code analysis

